Abstract
Introduction: Chimeric antigen receptor (CAR) T-cell therapy has revolutionized treatment for relapsed/refractory multiple myeloma (R/R MM), yet durable responses remain elusive and predictive biomarkers are lacking. Lentiviral and retroviral vectors used in CAR-T manufacturing integrate into the host genome at specific sites, which serve as clonal barcodes that reflect T-cell clonal identity and dynamics. Integration site (IS) analysis has revealed non-random patterns that may influence T-cell function, clonal expansion, and persistence. However, the predictive value of integration features remains unexplored. We aimed to characterize prognostic IS-related risk factors from available datasets and to construct a relapse prediction model.

Methods: We identified IS features linked to CAR-T therapy prognosis by building a model on multi-source external data (400 public CAR-T cases and 103 internal records of IS in vector-infected cell lines) and validating it in 24 MM patients (12 relapsed, 12 in remission for ≥12 months). S-EPTS/LM-PCR enabled high-throughput IS profiling, quantitative tracking of clonal relative abundance, and analysis of clonal dynamics (e.g., clonal dominance, abnormal expansion). Multi-dimensional features were extracted: genomic features (proximity to promoters, enhancers, cancer-related genes, safe harbor regions, CpG islands, TSS); gene functional annotations (associations with oncogenes/tumor suppressors, links to severe adverse events, gene expression levels); and patient-level clonal metrics (IS richness, evenness, PMD, dominant/long-term clone status). A random forest screened for highly discriminative features; binary classification models (dominant vs. non-dominant, long-term persistent vs. transient clones) were then trained and validated by cross-validation and in the 24-patient cohort, with AUC and prediction accuracy as the key indicators.

Results: Longitudinal IS profiling across 400 external patients identified over 6,000 dominant or long-term persistent integration events, complemented by more than 300,000 background integration profiles from 103 vector-infected cell lines, thereby establishing a comprehensive database/map of dominant or long-term persistent integration sites. A random forest algorithm refined the multi-dimensional feature set into 8 key informative variables, encompassing genomic localization within gene regions, associations of target genes with oncogenes or tumor suppressors, correlations with severe adverse events, and sample-level metrics including IS richness and evenness.
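The sample-level metrics named above can be illustrated with a minimal sketch. The abstract does not state how richness and evenness were computed, so this assumes richness is the count of distinct integration sites detected and evenness is Pielou's index (Shannon entropy divided by its maximum). The function names and the read counts are hypothetical and serve only as an illustration.

```python
import math


def richness(abundances):
    """Number of distinct integration sites with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)


def pielou_evenness(abundances):
    """Pielou's evenness: Shannon entropy divided by log(richness).

    Returns 1.0 when all clones are equally abundant and approaches 0
    as a single clone dominates. Assumed definition, not taken from the
    abstract.
    """
    counts = [a for a in abundances if a > 0]
    s = len(counts)
    if s < 2:
        return 0.0  # evenness is undefined for fewer than two clones
    total = sum(counts)
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    return shannon / math.log(s)


# Hypothetical read counts per integration site for two samples
polyclonal = [25, 25, 25, 25]  # four equally abundant clones
dominated = [97, 1, 1, 1]      # one dominant clone

for name, sample in [("polyclonal", polyclonal), ("dominated", dominated)]:
    print(name, richness(sample), round(pielou_evenness(sample), 3))
```

Both samples have the same richness, but the dominated sample has much lower evenness, which is the pattern the Results associate with elevated relapse risk.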
In the independent test cohort of 24 R/R MM patients, the median follow-up was 9.25 months. In the long-term responder cohort, the median duration of response was 13 months, while in the relapsed cohort the median time to relapse was 7 months. A total of 862 dominant or long-term persistent integration events were identified; the deep learning model trained on the selected features predicted dominant and long-term persistent clones with an AUC of 0.68 and a positive predictive value of 20%. Furthermore, in a comparative analysis of integration patterns between 12 responders and 12 non-responders, the model achieved an accuracy of 83.3% based on the 8 key informative variables identified during feature selection. The model also enabled preliminary patient-level relapse risk stratification. Key IS features associated with elevated relapse risk included proximity to oncogenes, diminished clonal evenness, and persistence of dominant clones. These observations highlight the potential of IS profiles as actionable biomarkers for predicting CAR-T therapy outcomes in R/R MM.

Conclusions: This study establishes the first predictive model for CAR-T response in R/R MM based on IS features associated with relapse risk, prognosis, and severe complications. The deep learning model, validated using multi-source datasets, demonstrates utility in stratifying patients and predicting clonal dynamics. Ongoing work to refine the model with larger cohorts will further optimize personalized CAR-T therapy strategies, advancing standardization and improving patient outcomes in the field.
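For readers less familiar with the indicators reported in the Results (AUC, positive predictive value, accuracy), the following is a minimal sketch of how each is computed. All labels, predictions, and scores are hypothetical. In a 24-patient cohort, an accuracy of 83.3% corresponds to 20 correct calls, but the abstract does not report how those calls split between the relapsed and remission groups, so the 10/10 split below is an assumption. The patient-level PPV this toy split produces is unrelated to the clone-level PPV of 20% quoted above.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def positive_predictive_value(y_true, y_pred):
    """TP / (TP + FP): share of positive calls that are truly positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else float("nan")


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cohort: 12 relapsed (1) and 12 in remission (0).
# Assumed split: 10 of 12 correct in each group, i.e. 20 of 24 overall.
y_true = [1] * 12 + [0] * 12
y_pred = [1] * 10 + [0] * 2 + [0] * 10 + [1] * 2

print(round(accuracy(y_true, y_pred), 3))                  # 0.833
print(round(positive_predictive_value(y_true, y_pred), 3))

# Hypothetical clone-level scores for the AUC illustration
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(auc(toy_labels, toy_scores))                         # 0.75
```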